UNIVERSITY OF CALIFORNIA, Los Angeles

Study of Functionally Related Gene Groups Using Microarray Expression Data: Theory and Application

Authors

  • Yijing Shen
  • Yingnian Wu
  • Steve Horvath
  • Mark Hansen
Abstract

Various clustering methods have been applied to microarray gene expression data to identify genes with similar expression profiles. As biological annotation data have accumulated, many genes have been organized into functional categories such as those of the Gene Ontology. Because functionally related genes may be regulated by common cellular signals, and hence be co-expressed, there is great interest in using these rapidly growing functional annotation resources to improve the performance of clustering methods. In addition, some “scattered” genes may have distinct expression profiles and may not be co-expressed with other genes. Identifying these “scattered” genes could further enhance the performance of clustering methods. We developed a new clustering algorithm, Dynamic Weighted Clustering with Noise set (DWCN), which makes use of prior gene annotation information and allows a set of scattered genes (“noise”) to be left out of the main clusters. This can significantly improve the quality of the final clusters, especially when a considerable percentage of such noisy genes is present in the data. An application to yeast cell-cycle gene expression data demonstrated that our method can produce clusters with more consistent functional annotation and more coherent expression patterns than traditional clustering techniques.
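The abstract does not give the DWCN update rules, so the following is only an illustrative sketch of the general idea of leaving scattered points out of the main clusters. It is not the authors' algorithm. It is a plain k-means-style loop in which any point farther than a threshold from every centroid is labelled as noise (`-1`), and it omits the annotation-based weighting entirely. The function name `cluster_with_noise` and the parameter `noise_radius` are hypothetical names chosen for this example.

```python
import math


def cluster_with_noise(points, centroids, noise_radius, n_iter=20):
    """k-means-style clustering with a noise set (illustrative only, not DWCN).

    points       : list of equal-length numeric tuples (e.g. expression profiles)
    centroids    : initial centroids, one per cluster (assumed to be supplied)
    noise_radius : hypothetical threshold; a point farther than this from
                   every centroid is labelled -1 and ignored in the update
    """
    centroids = [list(c) for c in centroids]
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid, or noise if it is too far away.
        new_labels = []
        for p in points:
            dists = [math.dist(p, c) for c in centroids]
            j = min(range(len(centroids)), key=dists.__getitem__)
            new_labels.append(j if dists[j] <= noise_radius else -1)
        # Update step: each centroid becomes the mean of its non-noise members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = [sum(x) / len(members) for x in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels, centroids


# Two tight groups plus one scattered point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 50)]
labels, centroids = cluster_with_noise(points, [(0, 0), (10, 10)], noise_radius=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Because noise points are excluded from the centroid update, a single scattered profile cannot drag a cluster centre away from its coherent members, which is the effect the abstract attributes to the noise set.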


Similar resources

A Review of Surface-Enhanced Raman Spectroscopy on Potential Clinical Applications Towards Diagnosing Colorectal Cancer

Colorectal cancer (CRC) is one of the leading cancers in the world, and early screening is still the best means of improving patient survival. However, colonoscopy, the current gold standard, is not without flaws, and an emerging technique called surface-enhanced Raman spectroscopy (SERS) coupled with machine learning is a possible candidate that could be applied in parallel with colonoscopy. This...

Keratin 13 is a more specific marker of conjunctival epithelium than keratin 19

Introduction: To evaluate the expression patterns of cytokeratin (K) 12, 13, and 19 in normal epithelium of the human ocular surface to determine whether K13 could be used as a marker for conjunctival epithelium. Methods: Total RNA was isolated from the human conjunctiva and central cornea. Those transcripts that had threefold or higher expression levels in the conjunctiva than the cornea wer...

Integration and Reduction of Microarray Gene Expressions Using an Information Theory Approach

The DNA microarray is an important technique that allows researchers to analyze many gene expression data in parallel. Although the data can be more significant if they come from separate experiments, one of the most challenging phases in the microarray context is the integration of separate expression level datasets that have been gathered through different techniques. In this paper, we prese...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in the prevention, diagnosis, and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Publication date: 2008